Communications Psychology
Springer Science and Business Media LLC
Preprints posted in the last 30 days, ranked by how well they match Communications Psychology's content profile, based on 20 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.
Nolan, C. R.; Le Pelley, M. E.; Garner, K. G.
The benefits of routines for daily functioning are widely acknowledged, yet, despite their apparent importance, methods for quantifying routine maintenance and the causes of their disruption remain lacking. Here, we propose a novel means of defining and quantifying routines (transition entropy). Using transition entropy, we show that routines can be robustly elicited on tasks that require searching through a grid of squares for a hidden target. Over two experiments (N=100 each), we show that use of routines--as quantified by transition entropy--is robustly perturbed by frequent switches between search grids, as locations specific to the currently irrelevant grid become competitive for selection. Using a normative model that tracks task dynamics, we show that disruption to routines can be attributed to reduced sensitivity to the odds of success for completing a task. This suggests that routine maintenance may be disrupted by over-sensitivity to a lack of reward early in routine performance, or by increased expectations regarding the utility of pursuing other tasks.
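As a rough sketch of how a transition-entropy measure of this kind could be computed (the function below is illustrative, not the authors' implementation; it assumes a routine is recorded as a sequence of visited grid locations):

    import numpy as np
    from collections import Counter

    def transition_entropy(visits):
        """Shannon entropy (bits) of the first-order transition distribution
        over a sequence of visited grid locations. Low entropy = stereotyped
        routine; high entropy = variable or disrupted search order."""
        pairs = Counter(zip(visits[:-1], visits[1:]))
        total = sum(pairs.values())
        probs = np.array([c / total for c in pairs.values()])
        return float(-(probs * np.log2(probs)).sum())

    # A rigid routine repeats the same few transitions -> low entropy
    routine = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
    # A perturbed search mixes many transitions -> higher entropy
    perturbed = [0, 2, 1, 3, 2, 0, 3, 1, 0, 3, 2, 1]
    print(transition_entropy(routine), transition_entropy(perturbed))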
Navarro, V. M.; Brugger, S.; Wolpe, N.; Harding, J.; Fletcher, P.; Teufel, C.
Predictive coding has influenced many conceptual accounts of delusions, the bizarre and distressing beliefs that accompany a range of neuropsychiatric conditions. However, these explanations remain incomplete and have rarely been tested directly using formal modelling. Here, we present a formal account of delusional beliefs based on hybrid predictive coding, which sheds light on the computational mechanisms underpinning the core features of delusions: thematic recurrence and imperviousness to contradictory evidence. In simulation experiments, we demonstrate that a combination of contextually inadequate initialisation of beliefs and excessive certainty (a hallmark of psychosis) triggers a reorganisation of the generative model relating observed events to hidden causes. This reorganisation enables the maintenance of delusional beliefs that are thematically stable, internally consistent with external inputs, and impervious to contradictory evidence, all without an increase in prediction error. Overall, our results suggest that delusions may arise not from faulty inference, as previously argued, but as an adaptive consequence of generative models learned under atypical conditions. These findings provide mechanistic insights into the computations underpinning delusions and have important implications for a novel therapeutic strategy in terms of re-training generative models.
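The abstract gives no equations, but the two ingredients it manipulates, belief initialisation and excessive certainty, can be illustrated with a minimal precision-weighted Gaussian update (a toy stand-in, not the authors' hybrid predictive coding model):

    def posterior_mean(mu_prior, pi_prior, obs, pi_obs):
        """Conjugate Gaussian update: the posterior mean is a
        precision-weighted average of prior belief and new evidence
        (pi_* are precisions, i.e. inverse variances)."""
        return (pi_prior * mu_prior + pi_obs * obs) / (pi_prior + pi_obs)

    # A belief initialised in the wrong place but held with excessive
    # certainty (high prior precision) barely moves under contradiction:
    print(posterior_mean(mu_prior=-5.0, pi_prior=100.0, obs=5.0, pi_obs=1.0))  # ~ -4.9
    # The same belief with ordinary certainty updates readily:
    print(posterior_mean(mu_prior=-5.0, pi_prior=1.0, obs=5.0, pi_obs=1.0))    # 0.0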
Atzert, C.; Dechterenko, F.; Lukavsky, J.; Busch, N. A.
Some images are consistently remembered better than others, suggesting that memorability reflects intrinsic image properties. We tested whether within-category distinctiveness underlies this effect. Across three experiments (N = 477), participants categorized indoor scenes previously rated for subjective typicality and then completed recognition memory tests. Typical scenes were categorized faster and more accurately, but were remembered worse and showed a more liberal response bias than atypical scenes. These opposing effects were robust across categories. To link subjective typicality to visual representations, we quantified image distinctiveness using a convolutional neural network (CNN). Across layers, CNN-derived distinctiveness closely tracked human typicality judgments and predicted both categorization speed and memorability, with strongest effects in higher, semantic layers. Critically, the memory advantage for atypical scenes persisted even when most images were atypical, ruling out rarity within the experimental context. Together, the results show that intrinsic scene memorability reflects an image's position within a category-specific representational space.
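One plausible way to operationalise CNN-derived within-category distinctiveness, sketched here with a pretrained torchvision ResNet (the network, layer choice, and distance metric are assumptions, not the authors' pipeline; image paths are hypothetical):

    import torch
    import torchvision.models as models
    from torchvision import transforms
    from PIL import Image

    # Pretrained CNN as a feature extractor (penultimate layer here; the
    # paper reports effects across layers, strongest in higher ones).
    net = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    net.fc = torch.nn.Identity()   # drop the classifier head
    net.eval()

    preprocess = transforms.Compose([
        transforms.Resize(256), transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    @torch.no_grad()
    def embed(paths):
        batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
        return net(batch)

    def distinctiveness(paths):
        """Distance of each scene from its category centroid in CNN feature
        space: larger = more atypical/distinctive within the category."""
        feats = embed(paths)
        centroid = feats.mean(dim=0, keepdim=True)
        return torch.linalg.vector_norm(feats - centroid, dim=1)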
Milham, M.; Low, D.; Erkent, A.; Trabulsi, J.; Kass, M. C.; Vos de Wael, R.; Yenepalli, S.; Wang, Y.; Leyden, M.; Jordan, C.; Salum, G.; Alexander, L.; Schubiner, G.; Hendrix, L.; Koyama, M.; Mears, L.; McAdams, R.; White, C.; Merikangas, K.; Satterthwaite, T. D.; Franco, A.; Klein, A.; Koplewicz, H.; Leventhal, B.; Freund, M.; Kiar, G.
Digital mental health applications enable high-frequency behavioral monitoring and scalable interventions. Journaling provides a therapeutically grounded and intrinsically engaging activity for many users. AI-based text analysis enables privacy-preserving phenotyping of clinically relevant patterns in naturalistic writing, including emotional distress and behavioral risk (e.g., indicators of intent, planning, or preparatory actions for harm to self or others). We evaluated a mobile journaling platform in an 8-week randomized controlled trial (N = 507) of young adults with mild-to-moderate anxiety and depression symptoms. Journaling produced modest reductions in anxiety relative to controls at the 8-week endpoint and 1-month follow-up (d = 0.16-0.19). Effects were small and did not remain significant after correction for multiple comparisons; complementary Bayesian models nonetheless provided moderate-to-strong directional evidence (90-97%) supporting a modest anxiety reduction. In parallel, behavioral phenotyping analyses showed that high-risk journal entries were more common among younger users (OR = 0.77 per year of age, p = 0.007). Text-based risk signals and self-reported energy exhibited significant circadian variation (e.g., risk probability was highest during late-night and overnight hours). Within-person analyses demonstrated strong short-term persistence in mood and risk states, with calm/relaxed showing the highest persistence and anxious/agitated exhibiting the lowest persistence. High-risk journal entries clustered temporally and were preceded by sustained low valence and energy. Although affective volatility was associated with acute declines within the same affective dimension (pleasantness or energy), it was not associated with escalation to high-risk states. Key behavioral dynamics observed in the trial were replicated in an independent general population dataset (N = 16,630). Collectively, these findings demonstrate that privacy-preserving digital journaling can support scalable longitudinal behavioral phenotyping and real-time risk monitoring while providing modest clinical benefit for anxiety symptoms.
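To unpack the reported age effect: an odds ratio of 0.77 per year means the odds of a high-risk entry shrink by roughly 23% with each additional year of age. A minimal sketch of such a model on simulated data (column names and effect sizes are assumptions, not the study's data):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical frame: one row per journal entry, with a binary
    # high-risk flag and the author's age.
    rng = np.random.default_rng(0)
    df = pd.DataFrame({"age": rng.uniform(18, 35, 2000)})
    df["high_risk"] = rng.binomial(1, 1 / (1 + np.exp(-(2.0 - 0.26 * df["age"]))))

    fit = smf.logit("high_risk ~ age", data=df).fit(disp=False)
    odds_ratio = np.exp(fit.params["age"])
    print(odds_ratio)   # ~0.77: odds of a high-risk entry fall ~23% per year of age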
Engeser, M.; Babaei, N.; Kaiser, D.
Each individual person looks at natural scenes in their own unique way, resulting in a distinct perceptual experience of the world. However, little is known about why such differences in gaze emerge. Here, we test the hypothesis that idiosyncrasies in gaze behavior are predicted by inter-subject variations in internal models--expectations about how scenes typically look. In two experiments, we first characterized participants' personal internal models by asking them to draw typical bathroom and kitchen scenes. Individual differences in these drawings were quantified using an objective deep learning pipeline and, in turn, related to individual differences in gaze behavior. In Experiment 1, where participants freely viewed a set of kitchen and bathroom photographs, inter-subject similarities in internal models did not predict inter-subject similarities in gaze. In Experiment 2, we encouraged strategic exploration through gaze-contingent viewing and a memory task. Here, inter-subject similarities in internal models predicted similarities in fixation frequency and the sequence in which different object categories were inspected. These findings suggest that the influence of internal models on visual exploration is stronger under increased sensory uncertainty and when expectation-guided sampling of the environment is encouraged. Together, our results provide new insights into how individual expectations shape gaze behavior and help explain why people differ in how they explore the visual world.
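The similarity-of-similarities logic can be sketched as a second-order correlation between two inter-subject distance structures (shapes, metrics, and variable names below are assumptions, not the authors' pipeline):

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    # Hypothetical per-participant vectors: a deep-net embedding of each
    # participant's typical-scene drawing, and a gaze summary (e.g.
    # fixation frequencies per object category).
    rng = np.random.default_rng(1)
    drawing_embeddings = rng.normal(size=(30, 128))
    gaze_profiles = rng.normal(size=(30, 20))

    # Inter-subject (dis)similarity structure in each space...
    drawing_dists = pdist(drawing_embeddings, metric="correlation")
    gaze_dists = pdist(gaze_profiles, metric="correlation")

    # ...and their second-order correspondence: do participants with
    # similar internal models also look at scenes in similar ways?
    rho, p = spearmanr(drawing_dists, gaze_dists)
    print(rho, p)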
Pham, T. Q.; Chikazoe, J.
Aesthetic preference is a primary driver of social behavior in the digital era, yet the extent to which these preferences are consistent across disparate domains remains poorly understood. We hypothesize that aesthetic judgment is governed by a domain-invariant latent structure, such that individuals who exhibit similar preferences in one category will demonstrate comparable alignment in seemingly unrelated domains. To test this, we recruited 37 participants to evaluate stimuli across three distinct aesthetic domains: art, faces (male and female), and scenes. We developed a novel computational framework that reformulates cross-domain preference as a user-based collaborative filtering problem, encoding individual profiles through inter-subject similarity matrices. Our model successfully predicted participant responses in a target domain based on their similarity to the cohort in a separate source domain. These results demonstrate robust cross-domain consistency, suggesting that aesthetic evaluation is mediated by an abstract, domain-general mechanism rather than being purely stimulus-dependent. We propose that this consistency is rooted in a shared neurophysiological pathway, likely involving the orbitofrontal cortex (OFC) and the Default Mode Network (DMN), and discuss how these findings provide a foundation for more sophisticated, cross-modal recommendation systems and the study of individual social identity.
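A minimal sketch of the user-based collaborative filtering idea, assuming simple cosine similarity in the source domain (the weighting scheme is an assumption; the paper's exact framework may differ):

    import numpy as np

    def predict_target_ratings(R_source, R_target, test_user):
        """User-based collaborative filtering across domains: weight other
        users' target-domain ratings by the test user's similarity to them,
        computed purely in the source domain."""
        # cosine similarity between the test user and the cohort (source domain)
        source = R_source / np.linalg.norm(R_source, axis=1, keepdims=True)
        sims = source @ source[test_user]
        sims[test_user] = 0.0                      # exclude self
        weights = np.clip(sims, 0, None)
        return weights @ R_target / weights.sum()  # weighted cohort average

    # Hypothetical ratings: 37 raters x items (art as source, faces as target)
    rng = np.random.default_rng(2)
    R_art, R_faces = rng.normal(size=(37, 50)), rng.normal(size=(37, 40))
    print(predict_target_ratings(R_art, R_faces, test_user=0)[:5])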
Ceolini, E.; Band, G.; Ghosh, A.
Fine-grained temporal structures emerge in smartphone behavioral recordings over multi-day periods. Complex systems research suggests that emergent temporal structures reflect underlying resource constraints of the system. Here we test whether cognitive abilities measured through speeded tasks (spanning fractions of a second) are reflected in emergent smartphone temporal structures spanning days, revealing how cognitive resource limitations shape naturalistic behavior. We analyzed smartphone tap interval patterns accumulated over several days and used decision tree regression models to predict performance in simple and choice reaction time tasks from these patterns. Simple reaction time was poorly predicted (R2 = 0.003), indicating that basic sensorimotor constraints play only a marginal role in shaping real-world behavioral timing. In contrast, choice reaction time was moderately predictable (R2 = 0.4), demonstrating that higher-order cognitive constraints prominently influence naturalistic temporal organization. Notably, while task performance operates at sub-second timescales, the predictive temporal patterns in smartphone behavior spanned milliseconds to several seconds and were accumulated over days, revealing the broad, multi-scale influence of cognitive resource constraints on everyday behavior. Both predicted and measured choice reaction times showed age-related decline, but the decline was more pronounced in predicted values, suggesting that age-related cognitive changes may be amplified in naturalistic contexts. These findings demonstrate that emergent temporal structures in smartphone use can reveal how cognitive processes measured using speeded tasks manifest, or fail to manifest, in real-world behavior, and that complex-systems approaches can thereby bridge laboratory and naturalistic assessments of cognition, revealing which cognitive processes meaningfully constrain real-world behavior.
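A bare-bones version of the prediction pipeline described, with simulated stand-ins for the tap-interval features (feature construction and model settings are assumptions):

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.model_selection import cross_val_score

    # Hypothetical features: per-user summaries of the smartphone
    # tap-interval distribution accumulated over days.
    rng = np.random.default_rng(3)
    tap_features = rng.normal(size=(500, 20))
    choice_rt = tap_features[:, :5].sum(axis=1) + rng.normal(scale=1.5, size=500)

    model = DecisionTreeRegressor(max_depth=4, random_state=0)
    r2 = cross_val_score(model, tap_features, choice_rt, cv=5, scoring="r2")
    print(r2.mean())   # the paper reports R^2 ~ 0.4 for choice RT vs ~0 for simple RT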
Diekmann, N.; Lissek, S.; Uengoer, M.; Cheng, S.
The progress of learning is usually quantified by averaging responses across participants and/or multiple trials within a block. However, such approaches obscure the trial-by-trial progress of learning, which has recently been shown to express a rich variety of dynamics. An alternative approach that does not suffer from this problem is the detection and analysis of points of behavioral change, i.e., change-point analysis. Using change-point analysis, we reanalyzed data from human participants in different predictive learning tasks in which learned contingencies underwent reversal. We find that responses of individual participants were more accurately characterized by behavioral change points than by the average learning curve. Importantly, change points significantly shifted to later trials during reversal learning, indicating that reversal learning is more difficult than initial learning. In a computational model based on deep reinforcement learning, we show that the change-point shift required the replay of previous experiences, which in turn depends on the hippocampus. This finding is consistent with studies showing that lesions of the hippocampus yield faster reversal learning. In summary, we reaffirm the importance of analyzing single-participant responses, show that phenomenological learning rates are slower during reversal learning, and provide a theoretical account for this difference.
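A minimal single change-point detector for binary trial-by-trial responses, in the spirit of the analysis described (the authors' change-point method may differ):

    import numpy as np

    def bernoulli_change_point(responses):
        """Most likely single change point in a 0/1 response sequence, by
        maximising the two-segment Bernoulli log-likelihood over splits."""
        r = np.asarray(responses, dtype=float)
        n = len(r)
        def loglik(x):
            p = np.clip(x.mean(), 1e-9, 1 - 1e-9)
            return x.sum() * np.log(p) + (len(x) - x.sum()) * np.log(1 - p)
        splits = range(2, n - 1)
        scores = [loglik(r[:k]) + loglik(r[k:]) for k in splits]
        return list(splits)[int(np.argmax(scores))]

    # Hypothetical learner: mostly wrong for 30 trials, then mostly correct
    rng = np.random.default_rng(4)
    resp = np.r_[rng.binomial(1, 0.2, 30), rng.binomial(1, 0.9, 30)]
    print(bernoulli_change_point(resp))   # should fall near trial 30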
Travi, F.; Mehta, A.; Castro, E.; Li, H.; Reinen, J.; Dhurandhar, A.; Meyer, P.; Fernandez Slezak, D.; Cecchi, G.; Polosecki, P.
A widespread view of neurodegenerative disorders, including Alzheimer's Disease (AD), frames their effects as accelerated aging, with the brain-age gap (BAG, the deviation of predicted brain age from chronological age) as a staple biomarker. However, BAG relies on a fundamental, untested assumption: that AD can be identified via age-invariant brain phenotypes. Using invariant representation learning on brain MRI from 44,178 individuals, we created neural representations that optimally convey age information (age-aware) or conversely remove it (age-invariant) while minimizing reconstruction distortion. We provide the first causal evidence that age information is necessary in brain biomarkers for AD detection: age-aware representations achieve competitive state-of-the-art performance and significantly outperform age-invariant ones (0.84 vs. 0.77 AUC, p < 0.001, with external validation). This necessity reveals a conceptual flaw in BAG: by subtracting chronological age, it discards the very information essential for accurate detection. Using conditional decoders to simulate aging trajectories, we found that healthy aging and AD operate along multiple independent anatomical dimensions (deep gray matter, frontoparietal, temporal). AD patients diverge from rather than accelerate healthy aging, showing pathological temporal shifts alongside, remarkably, relative frontoparietal preservation. Furthermore, representational similarity analysis suggests that even models pretrained on non-age tasks (e.g., sex or BMI) implicitly converge toward age-related features when optimized for AD. Given that the AD phenotype cannot be decoupled from age, our results establish a hard limit for age-independent biomarkers and favor multidimensional models that preserve aging structure over unidimensional summaries like BAG.
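For reference, the conventional BAG pipeline the paper critiques amounts to predicting age from brain features and keeping the residual; a compact sketch on simulated data (features and model are placeholders):

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_predict

    # Hypothetical morphometric features and ages
    rng = np.random.default_rng(5)
    brain_features = rng.normal(size=(1000, 200))
    age = 50 + brain_features[:, :10].sum(axis=1) * 2 + rng.normal(scale=5, size=1000)

    predicted_age = cross_val_predict(Ridge(alpha=1.0), brain_features, age, cv=5)
    bag = predicted_age - age   # BAG: subtracting chronological age is exactly
    print(bag[:5])              # the step the paper argues discards AD-relevant signal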
Boger, T.; Firestone, C.
Some objects appear animate (e.g., dogs and elephants) while others do not (e.g., boots and sofas). This distinction pervades human cognition, with an expansive literature reporting striking effects of animacy on vision, memory, social perception, and neural organization. But studies of perceived animacy face a persistent challenge: Objects that differ in animacy tend to differ in many lower-level visual features (e.g., shape, texture, spatial frequency). Thus, it remains controversial whether animacy per se -- as opposed to its lower-level correlates -- drives visual processing. Here, we achieve previously unattainable levels of experimental control to demonstrate that the visual system represents animacy itself, beyond its lower-level covariates. We vary animacy while holding nearly all lower-level features constant by exploiting "visual anagrams" -- a diffusion-based technique for generating static images whose interpretations change radically with orientation. Seven pre-registered experiments leverage this approach to demonstrate that representations of animacy structure visual working memory and guide visual attention. Thus, the visual system extracts animacy itself, beyond its lower-level correlates.
Grasso, C. L.; Nalborczyk, L.; van Wassenhove, V.
Is there a geometry of time in the human mind? A canonical measure of time in psychology is duration, a time interval quantifiable as a magnitude. Durations have been proposed to be arranged along a mental timeline: a unidimensional, linear, and spatialised representation of time. Here, we asked whether such a mental timeline is sufficient to account for the experience of duration. To address this, we tested the same participants in two experiments: a behavioural similarity judgment task, in which participants rated the similarity of duration pairs, and an electroencephalography (EEG) experiment in which they detected oddball durations in a sequence. Behavioural and EEG data were used to construct representational dissimilarity matrices, whose geometry was compared against theoretical models of duration organisation. Our results reveal that most variance in behavioural similarity judgements is explained by three latent dimensions, interpretable as: magnitude (monotonic ordering of durations), contextual encoding (distance to the geometric mean of the duration set), and a periodic component. These three dimensions are jointly consistent with a latent generalised helical model, which provided excellent fit to the behavioural data. Individual helical model parameters further correlated with endogenous neural oscillations measured during rest, suggesting that an individual's duration space is partially constrained by intrinsic dynamics. The neural geometry was also found to be dynamic, unfolding in two successive stages: a strong logarithmic encoding of durations peaking around 150 ms after duration offset, followed by a spring-like geometry starting around 300 ms after offset. Together, these findings describe multidimensional psychological and neural geometries of duration space, and characterise their relationship.
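The representational-geometry comparison can be sketched as correlating candidate model RDMs with an observed RDM (the duration set, models, and noise below are illustrative; the paper additionally fits a helical model not reproduced here):

    import numpy as np
    from scipy.spatial.distance import squareform
    from scipy.stats import spearmanr

    durations = np.array([0.4, 0.6, 0.9, 1.35, 2.0])   # hypothetical duration set (s)

    def model_rdm(values):
        """Pairwise absolute differences as a candidate dissimilarity model."""
        return np.abs(values[:, None] - values[None, :])

    linear_rdm = model_rdm(durations)               # linear mental timeline
    log_rdm = model_rdm(np.log(durations))          # logarithmic magnitude code
    contextual_rdm = model_rdm(np.abs(np.log(durations) - np.log(durations).mean()))

    # Compare each model against an observed (here: simulated) behavioural RDM
    observed = log_rdm + np.random.default_rng(6).normal(scale=0.05, size=log_rdm.shape)
    observed = (observed + observed.T) / 2
    np.fill_diagonal(observed, 0)
    for name, rdm in [("linear", linear_rdm), ("log", log_rdm),
                      ("contextual", contextual_rdm)]:
        rho, _ = spearmanr(squareform(rdm), squareform(observed))
        print(name, round(rho, 2))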
WU, X. N.; Ren, X.; Dreher, J.-c.; Liu, C.
Children frequently intervene in social conflicts by punishing violators or helping victims, yet the motivational mechanisms underlying such third-party altruistic behavior remain poorly understood. It remains unclear how children balance fairness concerns against self-interest, how these motivations interact with intervention costs and impact on outcomes, and whether gender and individual differences reflect distinct motivational structures. Here, we applied the motive cocktail model, which assumes that altruistic behavior arises from multiple prosocial motives, to dissociate motivations underlying third-party interventions. We studied 229 children aged 8-12 years (123 boys), an age when fairness and inequality aversion are reliably expressed. The third-party intervention task manipulated inequality between others, the personal cost of intervention, its impact on outcomes, and the form of intervention (punishment versus helping). Children intervened more as inequality increased and less as intervention costs rose, indicating a trade-off between moral benefits and self-interest. Gender differences emerged only under high-cost and high-impact conditions, with boys engaging in more punishment interventions. The motive cocktail model outperformed alternative models and revealed that boys showed stronger aversion to disadvantageous inequality and a greater tendency to reverse victims' disadvantage than girls. Clustering analyses further identified distinct motivational profiles within each gender. These findings demonstrate that children's third-party altruistic behavior is governed by multiple dissociable motives. This study provides a mechanistic account of how social motivations are organized and weighted during late childhood.
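The abstract does not specify the motive cocktail model's functional form; purely as a flavour of this model class, a Fehr-Schmidt-style utility with cost and impact terms (all parameter names and values here are assumptions, not the paper's model):

    def intervention_utility(inequality, cost, impact,
                             alpha=0.8, theta=0.5, gamma=1.0):
        """Toy utility for a third-party intervention, in the spirit of
        inequality-aversion models (NOT the paper's motive cocktail model):
        alpha  - aversion to inequality between the two others
        theta  - weight on reversing the victim's disadvantage (scaled by impact)
        gamma  - sensitivity to the personal cost of intervening
        Intervene when utility > 0."""
        return alpha * inequality + theta * impact * inequality - gamma * cost

    # Interventions rise with inequality and fall with cost, as in the data:
    print(intervention_utility(inequality=4, cost=1, impact=1))   # positive -> intervene
    print(intervention_utility(inequality=1, cost=2, impact=1))   # negative -> abstain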
Tzionit, N.; Filmon, D. G.; Maeir, T.; Boettcher, S. E. P.; Nobre, A. C.; Shalev, N.; Landau, A. N.
Attention-deficit/hyperactivity disorder (ADHD) has been associated with atypical temporal processing across multiple cognitive domains. However, most evidence derives from simplified paradigms that isolate timing from spatial behaviour. Here, we examine how temporal prediction operates within a continuous, dynamic visual environment. Using the Dynamic Visual Search (DVS) task, we embedded spatiotemporal regularities into a sustained stream of visual events, allowing observers to implicitly learn and anticipate predictable targets. Continuous mouse tracking provided a fine-grained measure of action planning beyond discrete reaction time and accuracy metrics. Young adults diagnosed with ADHD (N=40) were compared to matched neurotypical controls (N=38). Both groups benefited from target predictability and reduced distractor load, indicating intact early spatiotemporal learning in ADHD. Across the duration of the task, however, the groups diverged. Neurotypical participants showed progressive increases in behavioural benefits from prediction, accompanied by increasingly direct and efficient mouse trajectories. In contrast, individuals with ADHD reached a plateau in prediction benefits midway through the experiment. Their performance remained stable, with minimal evidence of resource depletion, but did not show further optimisation based on learned regularities. These findings suggest that while prediction formation is preserved in ADHD, its progressive utilisation across longer timescales is attenuated. Rather than reflecting a primary deficit in learning or sustained attention, ADHD may involve altered long-timescale integration or weighting of predictive information in dynamic environments.
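A directness measure of the kind continuous mouse tracking affords can be as simple as path efficiency, the straight-line distance divided by the distance actually travelled (a generic metric, not necessarily the authors'):

    import numpy as np

    def path_efficiency(xy):
        """Directness of a mouse trajectory: straight-line distance from
        start to end divided by distance travelled (1.0 = perfectly direct)."""
        xy = np.asarray(xy, dtype=float)
        travelled = np.linalg.norm(np.diff(xy, axis=0), axis=1).sum()
        straight = np.linalg.norm(xy[-1] - xy[0])
        return straight / travelled if travelled > 0 else np.nan

    direct = [(0, 0), (1, 1), (2, 2), (3, 3)]
    meander = [(0, 0), (1, 2), (0, 3), (2, 1), (3, 3)]
    print(path_efficiency(direct), path_efficiency(meander))   # 1.0 vs < 1.0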
Chavanne, A. V.; Wang, Y.; de Boer, A. A. A.; Xu, B.; van Prooije, T. H.; Kapteijns, K. C. J.; Reniers, C.; Hernandez-Castillo, C. R.; Fernandez-Ruiz, J.; van de Warrenburg, B. P.; Diedrichsen, J.; Muetzel, R. L.; Marquand, A. F.
Brain disorders are often characterized by biological heterogeneity that is poorly captured by group-average analyses. Normative modeling has emerged as a promising tool to parse out such heterogeneity, yet existing lifespan reference models rely on coarse parcellations, which may obscure individual variability. Using an aggregated reference sample (n=58,597 scans from n=51,107 participants), we provide openly available normative models of brain morphometry at the voxel level across the lifespan, and we illustrate their potential utility with two complementary applications. First, we investigated long-term brain development after preterm birth across two independent cohorts (n=284; n=304) and found individualized, replicable and persistent brain alterations. Second, we extracted high-resolution patient-level morphometric deviations in two samples with rare, genetic neurodegenerative disorders (spinocerebellar ataxia types 1 and 3; n=29, n=15), which showed marked heterogeneity. Together, our findings highlight that voxelwise normative modeling can detect clinically relevant, individualized deviations from a reference model with high spatial precision.
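At its core, a normative-model deviation is a z-score of the observed measure against the model's predicted mean and spread for that individual; a voxelwise sketch (the 2.6 threshold is a common convention in this literature, assumed here rather than taken from the paper):

    import numpy as np

    def normative_deviation(y_obs, mu_pred, sigma_pred):
        """Voxelwise deviation from a normative model: the z-score of an
        individual's observed morphometry given the model's predicted mean
        and predictive SD for that person's age/sex/site."""
        return (y_obs - mu_pred) / sigma_pred

    # Hypothetical single subject across 100 voxels
    rng = np.random.default_rng(7)
    mu, sigma = rng.normal(size=100), rng.uniform(0.5, 1.5, 100)
    y = mu + sigma * rng.normal(size=100)
    z = normative_deviation(y, mu, sigma)
    print((np.abs(z) > 2.6).sum(), "voxels flagged as extreme deviations")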
Bilgin, S. N.; Kononowicz, T. W.; Giomo, D.; Mustafali, U.
Metacognition refers to the capacity to monitor one's own actions, internal states, and cognitive processes. A central question in cognitive neuroscience is whether metacognitive evaluation operates as a direct readout of performance signals or requires computationally independent neural mechanisms. Single-process theories propose that both arise from shared decision variables, while the Higher-Order Representation theory holds that metacognition requires re-representation through distinct computational processes. To test these frameworks, participants produced timed motor intervals and evaluated their own performance without external feedback, termed temporal error monitoring (TEM). Vision Transformer decoding applied to PCA-optimized single-trial EEG captured theta, alpha, and beta dynamics during both task phases. First-order timing was decodable from any individual frequency band, whereas second-order metacognitive inference required simultaneous integration across all three bands before action termination. Individuals whose metacognitive states were more accurately decoded showed stronger TEM precision, with no equivalent relationship observed for first-order performance decoding. These findings establish metacognitive evaluation as a computationally distinct process requiring higher-order multi-band neural integration rather than a direct readout of first-order timing signals.
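The single-band versus integrated-band comparison can be sketched with any decoder; here a logistic regression stands in for the paper's Vision Transformer (simulated features; the real analysis and signal structure will differ):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Hypothetical single-trial band-power features (trials x features per band)
    rng = np.random.default_rng(8)
    n = 400
    state = rng.binomial(1, 0.5, n)                 # binary metacognitive judgment
    bands = {b: rng.normal(size=(n, 30)) for b in ("theta", "alpha", "beta")}
    for b in bands:                                 # weak signal in every band
        bands[b][:, 0] += 0.4 * state

    def decode(X):
        clf = LogisticRegression(max_iter=1000)
        return cross_val_score(clf, X, state, cv=5, scoring="roc_auc").mean()

    for b, X in bands.items():
        print(b, round(decode(X), 2))               # each band alone
    print("all", round(decode(np.hstack(list(bands.values()))), 2))  # integrated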
Higashi, H.
Extracting stable individual traits from behavior observed across diverse contexts is a central challenge in behavioral modeling. We propose a framework for inferring domain-invariant individual latent representations by jointly encoding behaviors across multiple domains. Using large-scale telemetry data from professional Counter-Strike 2 gameplay, we demonstrate that these representations are stable across distinct environments and roles, improving behavior prediction in novel domains. Our analysis reveals that complex idiosyncratic movement policies can be effectively compressed into low-dimensional embeddings, with as few as two dimensions capturing the majority of individual strategic variation. Crucially, the learned latent space forms a structured metric space where Euclidean distances predict the degradation of transfer performance. Furthermore, we show that the latent axes align with interpretable behavioral phenotypes, such as risk-taking and social cohesion. These findings suggest that multi-domain integration is a robust method for uncovering the functional structure of latent individuality in complex decision-making tasks, bridging the gap between high-dimensional telemetry data and meaningful psychological constructs.
Fukui, H.
Background: The ageing of incarcerated populations is accelerating across high-income countries, yet dementia remains absent from routine correctional mental health statistics. We investigated whether correctional data systems in Japan, the United States, the United Kingdom, and Australia are structurally capable of detecting dementia in their prison populations.
Methods: We conducted a cross-national descriptive analysis of publicly available, aggregate-level correctional data. Japanese data comprised all newly admitted sentenced prisoners from 2006 to 2024 (approximately 390,000 individuals) from the Ministry of Justice Correctional Statistics Annual, including mental disorder classifications and CAPAS-derived work aptitude scores (used as a proxy for cognitive functioning; not clinical IQ measurements). US data were drawn from the Bureau of Justice Statistics Survey of Prison Inmates (2016). UK data were obtained from the Ministry of Justice Offender Management Statistics Quarterly (2015-2025). Australian data were sourced from the Australian Institute of Health and Welfare National Prisoner Health Data Collection (2022, n = 371). All analyses were descriptive; no inferential statistics were conducted.
Findings: Three distinct mechanisms rendered dementia statistically invisible across all four countries. First, in the United States and Australia, reliance on self-report instruments produced a paradox in which self-reported mental disorder prevalence declined with age: among US state prisoners, reported prevalence fell from 44.9% in the 35-44 age group to 31.9% among those aged 65 and older - the opposite of community epidemiological patterns. Second, in Japan - the only country with systematic cognitive assessment at prison admission - 35.0% of female theft offenders had work aptitude scores below 70, yet the classification system contains no dementia category; 43-52% of all detected mental disorders were absorbed into a residual "other" category even after a 2023 classification revision that added four new diagnostic categories but not dementia. Third, the United Kingdom lacks routine mental health prevalence data collection in prisons altogether. None of the four countries includes dementia as a standard correctional classification category.
Interpretation: Correctional mental health statistics across four high-income countries are structurally incapable of detecting dementia - not through clinical ignorance but by design: systems built for younger populations that have not been updated as prison demographics have changed. Japan's ageing female theft offender profile (39.4% aged 60 or older, 35.0% with low cognitive scores) represents a potential sentinel population for undetected cognitive impairment. Targeted interventions - cognitive screening at admission in the United States and Australia, introduction of a dementia classification category in Japan, and routine mental health data publication in the United Kingdom - are feasible with existing infrastructure. As prison populations continue to age, the statistical invisibility of dementia constitutes an escalating failure of health surveillance with direct consequences for clinical care, sentencing, and human rights.
Demetriou, I.; Correia, M.; Vidal-Pineiro, D.; Apsvalka, D.; Attaheri, A.; Emery, T.; Henson, R. N.
Cortical volume, a widely used marker of brain ageing, is the product of two genetically and developmentally dissociable morphometric features: thickness and area. However, it remains unclear whether these two features have dissociable consequences for cognitive ageing. To address this, we analyse cross-sectional and longitudinal neuroimaging and cognitive data from one discovery cohort (Cam-CAN) and two independent, pre-registered replication cohorts (OASIS-3 and HABS-HD), leveraging wide age ranges across adulthood, different follow-up intervals and diverse population backgrounds. We show that thickness declines more steeply with age than does area, and shows stronger associations with longitudinal change in fluid cognitive abilities, fairly uniformly across the cortex. Cognitive change is also dependent on baseline thickness, independent of thickness change and independent of baseline cognitive ability. In contrast, area is comparatively stable across adulthood, at least until old age, and shows weaker and more heterogeneous associations with cognitive change, despite being a stronger mediator of the effect of polygenic scores on baseline cognitive ability. Together, these findings help to reconcile inconsistencies in the literature, and indicate that thickness provides a more sensitive marker of dynamic neurobiological processes underlying cognitive ageing, whereas area seems to reflect primarily stable, trait-like variation in cognitive ability.
Wang, S.
Large Language Models achieve impressive accuracy on medical benchmarks that present clinical information as complete vignettes, but their behavior under sequential information delivery, the standard mode of real clinical practice, is poorly characterized. We conduct a three-condition ablation study (N=50 NEJM-derived cases, 150 total runs) using claude-sonnet-4-20250514 to investigate what happens when diagnostic information arrives in stages rather than all at once. We introduce a novel 5+2 scoring rubric measuring seven dimensions of reasoning quality beyond binary accuracy, and a 6-code failure mode taxonomy enabling mechanistic root-cause analysis of diagnostic failures. We document Convergence Regression (CR): a systematic failure mode where models correctly identify diagnoses at intermediate reasoning stages but abandon them when subsequent evidence triggers pattern-matching to alternative diagnoses. Under unstructured sequential delivery, models access the correct diagnosis in 90% of cases but retain it in only 60%, creating a 30% Access-Stability Dissociation invisible under single-shot evaluation. A structured scaffold, the Sequential Information Prioritization Scaffold (SIPS), eliminates this gap entirely through forced hypothesis accountability: 80% access, 80% final accuracy, 0% Convergence Regression. We term this the SIPS Retention Effect. However, scaffolding reduces top-1 accuracy from 60% to 40%, a Convergence Hesitancy Paradox establishing that retention and convergence are architecturally distinct reasoning tasks requiring separate mechanisms. We propose that structured scaffolding functions as a diagnostic sensor for reasoning pathology rather than an accuracy intervention: it makes failure modes visible, classifiable, and auditable. We demonstrate that our measurement instruments operationalize WHO and FDA governance requirements for AI transparency, accountability, and safety into quantifiable scores. We release the complete framework, including the 5+2 rubric, 6-code taxonomy, scaffold specification, and 210-score matrix with adjudication rationale, as a reusable audit instrument for evaluating LLM reasoning behavior in any sequential reasoning context. The study evolved across three analytical phases: N=50 aggregate ablation establishing population-level scaffold effects; stratified N=10 mechanistic case analysis characterising the specific failure mode and its structural remedy; and N=10 cross-model replication across three architecturally distinct LLMs (Claude Sonnet 4, GPT-4o, Llama 3.3-70B) testing generalisability. A subsequent multi-model validation study confirms that core C3 process properties -- Hypothesis Tracking universality (5.0/5.0) and Step Adherence (4.9-5.0) -- replicate across GPT-4o and Llama 3.3-70B under identical protocols. The Convergence Hesitancy Paradox, while present in GPT-4o, is absent in Claude and Llama, establishing that the scaffold measures model-specific reasoning profiles rather than imposing a single fixed performance trade-off.
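The Access-Stability Dissociation can be computed from staged outputs with a few lines; the data structure below is an assumption, not the paper's released format:

    def access_stability(runs):
        """Access vs. retention over staged diagnostic runs. Each run is a
        list of differential-diagnosis lists (one per information stage)
        paired with the gold diagnosis."""
        accessed = retained = 0
        for stages, gold in runs:
            if any(gold in ddx for ddx in stages):
                accessed += 1                  # correct dx appeared at some stage
                if gold in stages[-1]:
                    retained += 1              # ...and survived to the final stage
        n = len(runs)
        return accessed / n, retained / n, (accessed - retained) / n

    runs = [
        ([["flu"], ["flu", "sepsis"], ["sepsis"]], "sepsis"),                      # retained
        ([["gout"], ["gout", "septic arthritis"], ["gout"]], "septic arthritis"),  # regressed
    ]
    print(access_stability(runs))   # (1.0, 0.5, 0.5): access, retention, dissociation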
Debnath, A.; Sarkar, S.
Background: Alzheimer's disease (AD) causes progressive decline in language and cognition. Automated speech analysis has emerged as a promising screening tool, yet clinical data scarcity limits progress. To address this, we generated a large-scale simulated speech dataset to model linguistic and acoustic deterioration across cognitive stages: Control, Mild Cognitive Impairment (MCI), and AD.
Methods: Using Monte Carlo simulations, we emulated the Pitt DementiaBank "Cookie Theft" narratives. Acoustic features (speech rate, pause duration, jitter, shimmer) and linguistic features (type-token ratio, unique-word count, filler usage) were synthetically sampled from real-world DementiaBank distributions. We trained an XGBoost classifier to distinguish diagnostic groups, and applied SHAP (Shapley Additive exPlanations) to assess feature importance.
Results: The model achieved high discriminative performance (AUC ≈ 0.94; accuracy ≈ 85%). Compared to controls, simulated MCI and AD groups showed progressive declines in fluency and lexical diversity, and increases in disfluencies and voice instability. SHAP analysis revealed that key predictors included reduced type-token ratio, higher pause and filler rates, and elevated jitter/shimmer. Classification was most accurate for Control vs. AD; MCI misclassifications highlighted intermediate profiles.
Interpretation: Our framework, FMN (Forget Me Not), captures clinically relevant speech changes using simulated data, offering an explainable and scalable approach for cognitive screening. While not a substitute for real datasets, FMN validates a pipeline that mirrors known AD markers and can guide future real-world deployments. External validation remains a key next step for translational impact.
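A skeletal version of the classifier-plus-explanation pipeline described (simulated features with assumed names and distributions; not the FMN code):

    import numpy as np
    import pandas as pd
    import xgboost as xgb
    import shap

    # Hypothetical feature table mirroring the abstract's speech markers
    rng = np.random.default_rng(9)
    n = 900
    X = pd.DataFrame({
        "speech_rate": rng.normal(3.0, 0.5, n), "pause_dur": rng.gamma(2, 0.3, n),
        "jitter": rng.gamma(2, 0.005, n), "shimmer": rng.gamma(2, 0.02, n),
        "type_token_ratio": rng.beta(5, 3, n), "filler_rate": rng.gamma(2, 0.05, n),
    })
    y = rng.integers(0, 3, n)   # 0 = Control, 1 = MCI, 2 = AD (toy labels)

    clf = xgb.XGBClassifier(n_estimators=200, max_depth=4, eval_metric="mlogloss")
    clf.fit(X, y)

    explainer = shap.TreeExplainer(clf)
    shap_values = explainer.shap_values(X)   # per-feature contributions to each
                                             # prediction, as in the paper's
                                             # feature-importance analysis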